If you wish to print this article or view it in its
entirety, please load it into your word processor
as ARTICLE3.
*********************************
* For an overview of these *
* articles, please first read *
* the file ARTICLE0.SEE *
*********************************
Where Do Translators Fit Into Machine Translation?
Original And Supplementary Questions
By Alex Gross
Submitted to the MT SUMMIT III Conference
June 3, 1991 2:00 -- 3:45 PM
(and later published in the Sci-Tech Translation
Journal, April, 1993)
Here are the original questions for this panel as submitted to the
speakers:
1. At the last MT Summit, Martin Kay stated that there should
be "greater attention to empirical studies of translation so that
computational linguists will have a better idea of what really goes
on in translation and develop tools that will be more useful for the
end user." Does this mean that there has been insufficient input
into MT processes by translators interested in MT? Does it mean
that MT developers have failed to study what translating actually
entails and how translators go about their task? If either of these
is true, then to what extent and why? New answers and insights for
the MT profession could arise from hearing what human translators
with an interest in the development of MT have to say about these
matters. It may well turn out that translators are the very people
best qualified to determine what form their tools should take, since
they are the end users.
2. Is there a specifically "human" component in the
translation process which MT experts have overlooked? Is it
reasonable for theoreticians to envision setting up predictable and
generic vocabularies of clearly defined terms, or could they be
overlooking a deep-seated human tendency towards some degree of
ambiguity--indeed, in those many cases where not all the facts are
known, an inescapably human reliance on it? Are there any viable MT
approaches to duplicate what human translators can provide in such
cases, namely the ability to bridge this ambiguity gap and improvise
personalized, customized case-specific subtleties of vocabulary,
depending on client or purpose? Could this in fact be a major
element of the entire translation process? Alternately, are there
some more boring "machine-like" aspects of translation where the
computer can help the translator, such as style and consistency
checking?
3. How can the knowledge of practicing translators best be
integrated into current MT research and working systems? Is it to
be assumed that they are best employed as prospective end-users
working out the bugs in the system, or is there also a place for
them during the initial planning phases of such systems? Can they
perhaps as users be the primary developers of the system?
4. Many human translators, when told of the quest to have
machines take over all aspects of translation, immediately reply
that this is impossible and start providing specific instances which
they claim a machine system could never handle. Are such reactions
merely the final nerve spasms of a doomed class of technicians
awaiting superannuation, or are these translators in fact
enunciating specific instances of a general law as yet not fully
articulated? Since we now hear claims suggesting that FAHQT is
creeping in again through the back door, it seems important to ask
whether there has in fact ever been sufficient basic mathematical
research, much less algorithmic underpinnings, by the MT Community
to determine whether FAHQT, or anything close to it, can be achieved
by any combination of electronic stratagems (transfer, AI, neural
nets, Markov models, etc.). Must translators forever stand exposed
on the firing line and present their minds and bodies to a broadside
of claims that the next round of computer advances will annihilate
them as a profession? Is this problem truly solvable in logical
terms, or is it in fact an intractable, undecidable, or provably
unsolvable question in terms of "Computable Numbers" as set out by
Turing, based on the work of Hilbert and Goedel? A reasonable
answer to this question could save boards of directors and/or
government agencies a great deal of time and money.
SUPPLEMENTAL QUESTIONS
It was also envisioned that a list of Supplemental Questions
would be prepared and distributed not only to the speakers but
everyone attending our panel, even though not all of these questions
could be raised during the session, so as to deepen our discussion
and provide a lasting record of these issues.
FAHQT: Pro and Con
Consider the following observation on FAHQT: "The ideal notion
of fully automatic high quality translation (FAHQT) is still lurking
behind the machine translation paradigm: it is something that MT
projects want to reach." (1) Is this a true or a false observation?
Is FAHQT merely a matter of time and continued research, a
direct and inevitable result of a perfectly asymptotic process?
Will FAHQT ever be available on a hand-held calculator-sized
computer? If not, then why not?
To what extent is the belief in the feasibility of FAHQT a form
of religion or perhaps akin to a belief that a perpetual motion
device can be invented?
Technical Linguistic Questions
Let us suppose a writer has chosen to use Word C in a source
text because s/he did not wish to use Word A or Word B, even though
all three are shown as "synonyms." It turns out that all three of
these words overlap and semantically interrelate quite differently
in the target language. How can MT handle such an instance, fairly
frequently found in legal and diplomatic usage?
Virtually all research in both conventional and computational
linguistics has proceeded from the premise that language can be
represented and mapped as a linear entity and is therefore eminently
computable. What if it turns out that language in fact occupies a
virtual space as a multi-dimensional construct, including several
fractal dimensions, involving all manner of non-linear turbulence,
chaos, and Butterfly Effects?
Post-Editors and Puppeteers
Let's assume you saw an ad for an Automatic Electronic
Puppeteer that guaranteed to create and produce endless puppet
plays in your own living room. There would be no need for a
puppeteer to run the puppets and no need for you even to script
the plays, though you would have the freedom to intervene in the
action and change the plot as you wished. Since the price was
acceptable, you ordered this system, but when it arrived, you
found that it required endless installation work and calls to the
manufacturers to get it working. But even then, you discovered
that the number of plays provided was in fact quite limited, your
plot change options even more so, and that the movements of the
puppets were jerky and unnatural. When you complained, you were
referred to fine print in the docs telling you that to make the
program work better, you would have to do one of two things: 1)
master an extremely complex programming language or 2) hire a
specially trained puppeteer to help you out with your special
needs and to be on hand during your productions to make the
puppets move more naturally. Does this description bear any
resemblance to the way MT has functioned and been promoted in
recent years?
A Practical Example
Despite many presentations on linguistic, electronic and
philosophical aspects of MT at this conference, one side of
translation has nonetheless gone unexplored. It has to do with
how larger translation projects actually arise and are handled by
the profession. The following story shows the world of human
translation at close to its worst, and it might be imagined at
first glance that MT could easily do a much better job and simply
take over in such situations, which are far from atypical in the
world of translation. But, as we shall see, such appearances may
be deceptive. To our story:
A French electrical firm was recently involved in a hostile
take-over bid and lawsuit with its American counterpart. Large
numbers of boxes and drawers full of documents all had to be
translated into English by an almost impossible deadline.
Supervision of this work was entrusted to a paralegal assistant
in the French company's New York law firm. This person had no
previous knowledge of translation. The documents ran the gamut
from highly technical electrical texts and patents, records of
previous lawsuits, company correspondence, advertisements,
product documentation, speeches by the Company's directors, etc.
Almost every French-to-English translator in the NYC area was
asked to take part. All translators were required to work at the
law firm's offices so as to preserve confidentiality. Mere
translation students worked side by side with newly accredited
professionals and journeymen with long years of experience. The
more able quickly became aware that much of the material was far
too difficult for their less experienced colleagues. No
consistent attempt was made to create or distribute glossaries.
Wildly differing wages were paid to translators, with little
connection to their ability. Several translation agencies were
caught up in a feverish battle to handle most of the work and
desperately competed to find translators. No one knows the
quality of the final product, but it cannot have been routinely
high. Some translators and agencies have still not been fully
paid. As the deadline drew closer, more and more boxes of
documents appeared. And as the final blow, the opposing
company's law firm also came onto the scene with boxes of its own
documents that needed translation. But these newcomers imposed
one nearly impossible condition, also for reasons of
confidentiality: no one who had translated for the first law firm
would be permitted to translate for them.
Now let us consider this true-life tale, which occurred just
three months ago, and see how--or whether--MT could have handled
things better, as is sometimes claimed. Let's be generous and
remove one enormous obstacle at the start by assuming that all
these cases of documents were in fact in machine-readable form
(which, of course, they weren't). Even if we accord MT this
ample handicap, there are still a number of problems it would
have had trouble coping with:
1. How could a sufficient number of competent post-editors
be found or trained before the deadline?
2. How could a sufficiently large and accurate MT
dictionary be compiled before the deadline? Doesn't creating such
a dictionary require finishing the job first and then
saving it for the next job, in the hope that it will be similar?
3. The simpler Mom & Pop store & smaller agency structure
of the human translation world was nonetheless able to
field at least some response to this challenge because of its
large slack capacity. Would an enormously powerful and
expensive mainframe computer have the same slack capacity,
i.e., could it be kept inactive for long periods of
time until such emergencies occurred? If so, how would this be
reflected in the prices charged for its services?
4. How would MT companies have dealt with the secrecy
requirement, that translation must be done in the law firm's office?
5. How would an MT Company comply with the demand of the
second law firm, that the same post-editors not be used, and
still land the job?
6. Supposing the job proved so enormous that two MT firms
had to be hired--assuming they used different systems,
different glossaries, different post-editors, how
could they have collaborated without creating even
more work and confusion?
Larger Philosophical Questions
Is it in any final sense a reasonable assumption, as many
believe, that progress in MT can be gradual and cumulative in
scope until it finally comes to a complete mastery of the
problem? In other words, is there a numerical process by which
one first masters 3% of all knowledge and vocabulary building
processes with 85% accuracy, then 5% with 90% accuracy, and so on
until one reaches 99% with 99% accuracy? Is this the whole story
of the relationship between knowledge and language, or are there
possibly other factors involved, making it possible for reality
to manifest itself from several unexpected angles at once? In
other words, are we dealing with language as a linear entity when
it is in fact a multi-dimensional one?
Einstein maintained that he didn't believe God was playing
dice with the universe. Is it possible that by using AI
rule-firing techniques with their built-in certainty and confidence
values, computational linguists are playing dice with the meaning
of that universe?
It would be possible to design a set of "Turing Tests" to
gauge the performance of various MT systems as compared with
human translation skills. The point of such a process, as with
all Turing Tests, would be to determine if human referees could
tell the difference between human and machine output. All
necessary safeguards, handicaps, alternate referees, and double
blind procedures could be devised, provided the will to take part
in such tests actually existed. True definitions for cost,
speed, accuracy, and post-editing needs might all have at least a
chance of being estimated as a result of such tests. What are
the chances of their taking place some time in the near future?
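As a rough illustration of how the results of such a blind test might be scored, here is a minimal sketch in Python. The sample labels, the referee's guesses, and the function names are all hypothetical, and the comparison against random guessing is only one of several ways a real test might be evaluated.

```python
from math import comb


def count_correct(truth, guesses):
    """Number of samples whose origin the referee guessed correctly."""
    if len(truth) != len(guesses) or not truth:
        raise ValueError("truth and guesses must be non-empty and equal in length")
    return sum(t == g for t, g in zip(truth, guesses))


def chance_probability(correct, total):
    """Probability of at least `correct` right answers out of `total`
    if the referee were guessing at random (one-sided binomial tail, p = 0.5)."""
    return sum(comb(total, k) for k in range(correct, total + 1)) / 2 ** total


# Hypothetical data: "H" = human translation, "M" = machine output.
truth = ["H", "M", "M", "H", "M", "H", "H", "M", "M", "H"]
guesses = ["H", "M", "H", "H", "M", "H", "M", "M", "M", "H"]

correct = count_correct(truth, guesses)
total = len(truth)
print(f"Referee identified {correct} of {total} samples correctly")
print(f"Probability of doing at least this well by pure guessing: "
      f"{chance_probability(correct, total):.3f}")
```

If the referees do no better than guessing, the machine output has in effect passed; if they identify it reliably, it has not. Cost, speed, and post-editing effort would have to be recorded separately.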
"Computerization is the first stage of the industrial
revolution that hasn't made work simpler." Does this statement,
paraphrased from a book by a Harvard Business School professor,
(2) have any relevance for MT? Is it correct to state that
several current MT systems actually add one or more levels of
difficulty to the translation process before making it any
easier?
While translators may not be able to articulate precisely
what kind of interface for translation they most desire, they can
certainly state with great certainty what they do NOT want. What
they do not want is an interface that is any of the following:
harder to learn and use than conventional translation;
more likely to make mistakes than the above;
lending less prestige than the above;
less well paid than the above.
Are these also concerns for MT developers?
What real work has been done in the AI field in terms of
treating translation as a Knowledge Domain and translators as
Domain Experts and pairing them off with Knowledge Engineers?
What qualifications were sought in either the DE's or the KE's?
Are MT developers using the words "asymptote" and
"asymptotic" in their correct mathematical sense, or are they
rather using them as buzzwords to impart a false air of
mathematical precision to their work? Is the curve their
would-be asymptote steadily approaching a representation of FAHQT or
something reasonably similar, or could it just turn out to be the
edge of a semanto-linguistic Butterfly Effect drawing them
inexorably into what Shannon and Weaver recognized as entropy,
perhaps even into true Chaos?
Must not all translation, including MT, be recognized as a
subset of two far larger sets, namely writing and human
mediation? In the first case, does it not therefore become
pointless to maintain that there are no accepted standards for
what constitutes a "good translation," when of course there are
also no accepted standards for what constitutes "good writing?"
Or for that matter, no accepted standards for what constitutes
"correct writing practices," since all major publications and
publishing houses have their own in-house style manuals, with no
two in total agreement, either here or in England. And is not
translation also a specialized subset of a more generalized form
of "mediation," merely employing two natural languages instead of
one? In which case, may it belong to the same superset which
includes "explaining company rules to new employees," public
relations and advertising, or choosing exactly the right
time to tell Uncle Louis you're marrying someone he
disapproves of? Are not the only real differences
between foreign language translation and such upscale
mediation that two languages are involved and the context is
usually more limited? In either case (or in both together), what
happens if all the complexities that can arise from superset
activities descend into the subset and also become "translation
problems" at any time? How does MT deal with either of these
cases?
Does the following reflection by Wittgenstein apply to MT:
"A sentence is given me in code together with the key. Then of
course in one way everything required for understanding the
sentence has been given me. And yet I should answer the question
`Do you understand this sentence?': No, not yet; I must first
decode it. And only when e.g. I had translated it into English
would I say `Now I understand it.'
"If now we raise the question `At what moment of translating
do I understand the sentence?' we shall get a glimpse into the
nature of what is called `understanding.'" To take
Wittgenstein's example one step further, if MT is used, at what
moment of translation does what person or entity understand the
sentence? When does the system understand it? How about the
hasty post-editor? And what about the translation's target
audience, the client? Can we be sure that understanding has
taken place at any of these moments? And if understanding has
not taken place, has translation?
Practical Suggestions for the Future
1. The process of consultation and cooperation between
working translators and MT specialists which has begun here today
should be extended into the future through the appointment of
Translators in Residence in university and corporate settings,
continued lectures and workshops dealing with these themes on a
national and international basis, and greater consultation
between them in all matters of mutual concern.
2. In the past, many legislative titles for training and
coordinating workers have gone unused during each Congressional
session in the Department of Labor, HEW, and Commerce. If there
truly is a need for retraining translators to use MT and CAT
products, it behooves system developers--and might even benefit
them financially--to find out if such funding titles can be used
to help train translators in the use of truly viable MT systems.
3. It should be the role of an organization such as MT
Summit III to launch a campaign aimed at helping people
everywhere to understand what human translation and machine
translation can and cannot do so as to counter a growing trend
towards fast-word language consumption and use.
4. Concomitantly, those present at this Conference should
make their will known on an international scale that there is no
place in the MT Community for those who falsify the facts about
the capabilities of either MT or human translators. The fact
that foreign language courses, both live and recorded, have been
deceitfully marketed for decades should not be used as an excuse
to do the same with MT. I have appended a brief Code of Ethics
document for discussion of this matter.
5. Since AI and expert systems are on the lips of many as
the next direction for MT, a useful first step in this direction
might be the creation of a simple expert system which prospective
clients might use to determine if their translation needs are
best met by MT, human translation, or some combination of both.
I would be pleased to take part in the design of such a program.
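To suggest what the simple expert system proposed in point 5 might look like, here is a minimal rule-based sketch in Python. Every field name and every rule in it is a hypothetical placeholder chosen for illustration; a real system would need rules worked out jointly by translators and MT developers.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical description of a client's translation job."""
    machine_readable: bool     # is the source text already in electronic form?
    domain_dictionary: bool    # does a vetted glossary for the subject exist?
    publication_quality: bool  # must the output be polished, error-free prose?
    repetitive_text: bool      # e.g. manuals with much recurring phrasing


def recommend(job):
    """Return 'human', 'MT', or 'MT + human post-editing'.

    The rules below are placeholders for discussion, not tested guidance.
    """
    # Without electronic text or a suitable dictionary, MT cannot start.
    if not job.machine_readable or not job.domain_dictionary:
        return "human"
    # Polished output: MT helps only where the text repeats itself.
    if job.publication_quality:
        return "MT + human post-editing" if job.repetitive_text else "human"
    # Rough output for information purposes only.
    return "MT"


print(recommend(Job(True, True, True, True)))
print(recommend(Job(False, True, False, False)))
```

Even a toy version like this makes the underlying assumptions visible to the client, which is the main point of the proposal.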
DRAFT CODE OF ETHICS
1. No claims about existing or pending MT products should
be made which indicate that MT can reduce the number of human
translators or the total cost of translation work unless all
costs for the MT project have been scrupulously revealed,
including the total price for the system, fees or salaries for
those running it, training costs for such workers, training costs
for additional pre-editors or post-editors including those who
fail at this task, and total costs of amortization over the full
period of introducing such a system.
2. No claims should be made for any MT system in terms of
"percentage of accuracy," unless this figure is also spelled out
in terms of number of errors per page. Any unwillingness to
recognize errors as errors shall be considered a violation of
this condition, except in those cases where totally error-free
work is not required or requested.
3. No claim should be made that any MT system produces
"better-quality output" than human translators unless such a
claim has been thoroughly quantified to the satisfaction of all
parties. Any such claim should be regarded as merely anecdotal
until proved otherwise.
4. Researchers and developers should devote serious study
to the issue of whether their products might generate less sales
resistance, public confusion, and resentment from translators if
the name of the entire field were to be changed from "machine
translation" or "computer translation" to "computer assisted
language conversion."
5. The computer translation industry should bear the cost
of setting up an equitably balanced committee of MT workers and
translators to oversee the functioning of this Code of Ethics.
6. Since translation is an intrinsically international
industry, this Code of Ethics must also be international in its
scope, and any company violating its tenets on the premise that
they are not valid in its country shall be considered in
violation of this Code. Measures shall be taken to expose and
punish habitual offenders.
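To make the conversion called for in point 2 of this draft Code concrete, here is a minimal sketch in Python. It assumes that "percentage of accuracy" is counted per word and that a page holds about 250 words; both are illustrative assumptions rather than figures from any MT system, and the function name is hypothetical.

```python
def errors_per_page(accuracy_percent, words_per_page=250):
    """Convert a per-word accuracy percentage into expected errors per page.

    Assumes accuracy is counted per word and that a page holds
    `words_per_page` words (250 is an illustrative assumption).
    """
    if not 0 <= accuracy_percent <= 100:
        raise ValueError("accuracy_percent must be between 0 and 100")
    error_rate = (100 - accuracy_percent) / 100
    return error_rate * words_per_page


for claimed in (99, 95, 90):
    print(f"{claimed}% accuracy -> about "
          f"{errors_per_page(claimed):.1f} errors per page")
```

Under these assumptions, a claim of 95% accuracy still leaves a dozen or so errors on every page, which is why the Code asks for both figures to be stated together.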
Respectfully Submitted by
Alex Gross, Co-Director
Cross-Cultural Research Projects
P.O. Box 660--Cooper Station
New York, NY 10276
(212) 777-7609
CompuServe: 71071,1520
(1) Kimmo Kettunen, in a letter to Computational
Linguistics, vol. 12, No. 1, January-March, 1986
(2) Shoshana Zuboff: In the Age of the Smart Machine:
The Future of Work and Power, Basic Books, 1991.
Copyright 1991 and 1995 by Alexander Gross
This piece may be reproduced for
individuals and for educational
purposes. It may not be used for
any commercial (i.e., money-making)
purpose without written permission
from the author.